Semantic Classification, Keyword Mining and Search Space Optimization for digital ecosystems

Authors

  • Nikunj Yadav
  • Yanu Gupta
  • Manish Kumar
  • Ratna Sanyal
Abstract

The number of documents in digital repositories runs into the thousands and is constantly increasing. In such a scenario, organizing and retrieving these documents in a way that matches how people think becomes an important problem. In this paper, we present a novel approach that classifies the documents in a digital repository and finds the semantically significant keywords related to them, making organization and retrieval faster and more efficient. We approach this problem using Probabilistic Latent Semantic Analysis (PLSA) with incomplete training data to organize the documents and mark the relevant keywords. This approach makes classification faster and, instead of unlabeled clusters, yields classes with well-defined topics that follow human logic.
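The abstract does not spell out the model, so the following is only a minimal sketch of how Probabilistic Latent Semantic Analysis might be fitted when just some documents carry a topic label. It assumes, purely for illustration, that labeled documents are pinned to their topic, that P(word | topic) is seeded from the labeled documents, and that the top-probability words of each topic serve as its keywords. The names `fit_plsa`, `top_keywords`, the `smoothing` parameter, and the toy corpus are hypothetical and not taken from the paper.

```python
def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total <= 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def fit_plsa(counts, n_topics, labels, n_iter=30, smoothing=0.01):
    """Fit PLSA by EM on a document-term count matrix (illustrative sketch).

    counts: list of rows, counts[d][w] = occurrences of word w in document d.
    labels: dict {document index: topic index} for the labeled subset only.
    Returns (p_z_d, p_w_z): P(topic | document) and P(word | topic).
    """
    n_docs, n_words = len(counts), len(counts[0])

    def one_hot(k):
        return [1.0 if z == k else 0.0 for z in range(n_topics)]

    # Assumption: seed P(w|z) from labeled documents, smoothed to avoid zeros.
    seed = [[smoothing] * n_words for _ in range(n_topics)]
    for d, z in labels.items():
        for w in range(n_words):
            seed[z][w] += counts[d][w]
    p_w_z = [normalize(row) for row in seed]

    # Labeled documents are pinned to their topic; the rest start uniform.
    p_z_d = [one_hot(labels[d]) if d in labels else normalize([1.0] * n_topics)
             for d in range(n_docs)]

    for _ in range(n_iter):
        word_mass = [[0.0] * n_words for _ in range(n_topics)]
        topic_mass = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                for z in range(n_topics):
                    word_mass[z][w] += n * post[z]
                    topic_mass[d][z] += n * post[z]
        # M-step: re-estimate both distributions, keeping labeled docs pinned.
        p_w_z = [normalize(row) for row in word_mass]
        p_z_d = [one_hot(labels[d]) if d in labels else normalize(topic_mass[d])
                 for d in range(n_docs)]
    return p_z_d, p_w_z


def top_keywords(p_w_z, vocab, k=2):
    """Return the k most probable words for each topic."""
    return [[vocab[w] for w in sorted(range(len(vocab)), key=lambda w: -row[w])[:k]]
            for row in p_w_z]


# Toy corpus (made up): documents 0 and 2 are labeled, 1 and 3 are not.
vocab = ["neural", "network", "protein", "gene"]
counts = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 0, 2, 3]]
labels = {0: 0, 2: 1}
p_z_d, p_w_z = fit_plsa(counts, n_topics=2, labels=labels)
keywords = top_keywords(p_w_z, vocab)
```

In this toy run, the unlabeled documents inherit the topic of the labeled document whose vocabulary they share, so each topic keeps a human-assigned meaning rather than being an anonymous cluster. Topics without any labeled document would start from a uniform word distribution in this sketch and would need a different initialization.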

Similar resources

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growing Web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

FUZZY GRAVITATIONAL SEARCH ALGORITHM AN APPROACH FOR DATA MINING

The concept of intelligently controlling the search process of gravitational search algorithm (GSA) is introduced to develop a novel data mining technique. The proposed method is called fuzzy GSA miner (FGSA-miner). At first a fuzzy controller is designed for adaptively controlling the gravitational coefficient and the number of effective objects, as two important parameters which play major ro...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Web Page Structure Enhanced Feature Selection for Classification of Web Pages

Web page classification is achieved using text classification techniques. Web page classification is different from traditional text classification due to additional information, provided by web page structure which provides much information on content importance. HTML tags provide visual web page representation and can be considered a parameter to highlight content importance. Textual keywords...

Modified CLPSO-based fuzzy classification System: Color Image Segmentation

Fuzzy segmentation is an effective way of segmenting out objects in images containing both random noise and varying illumination. In this paper, a modified method based on the Comprehensive Learning Particle Swarm Optimization (CLPSO) is proposed for pixel classification in HSI color space by selecting a fuzzy classification system with minimum number of fuzzy rules and minimum number of incorr...

Journal title:
  • JMPT

Volume 1  Issue 

Pages  -

Publication date: 2010